Abstract


For each function on bit strings, its restriction to bit strings of 
any given length can be computed by a finite instruction sequence 
that contains only instructions to set and get the content of Boolean 
registers, forward jump instructions, and a termination instruction. We 
describe instruction sequences of this kind that compute the function on 
bit strings that models multiplication on natural numbers less than 2^N
with respect to their binary representation by bit strings of length N, 
for a fixed but arbitrary N > 0, according to the long multiplication 
algorithm and the Karatsuba multiplication algorithm. One of the 
results obtained is that the instruction sequence expressing the former 
algorithm is longer than the one expressing the latter algorithm only if 
the length of the bit strings involved is greater than 2^8. We also go into
the use of an instruction sequence with backward jump instructions 
for expressing the long multiplication algorithm. This leads to an 
instruction sequence that is shorter than the other two if the length
of the bit strings involved is greater than 2. 


Keywords: bit string function, single-pass instruction sequence, backward
jump instruction, long multiplication algorithm, Karatsuba multiplication
algorithm, halting problem.


1 Introduction 


This paper belongs to a line of research in which issues relating to various
subjects from computer science, including programming language expressiveness,
computability, computational complexity, algorithm efficiency, algorithmic
equivalence of programs, program verification, program performance,
program compactness, and program parallelization, are rigorously investigated
by thinking in terms of instruction sequences. An enumeration of most papers
belonging to this line of research is available at [11]. The work on
computational complexity presented in [4] and the work on algorithmic 
equivalence of programs presented in [5] were prompted by the fact that, 
for each function on bit strings, its restriction to bit strings of any given 
length can be computed by a finite instruction sequence that contains only 
instructions to set and get the content of Boolean registers, forward jump 
instructions, and a termination instruction. 


This fact also incited us to look for finite instruction sequences containing 
only the above-mentioned instructions that compute a well-known function 
on bit strings of a given length. Earlier, we did so taking the hash function 
SHA-256 from the Secure Hash Standard [16] as the well-known function on 
bit strings. In the current paper, we do so taking the function that models 
multiplication on natural numbers less than 2^N with respect to their binary
representation by bit strings of length N, for a fixed but arbitrary N > 0, 
as the well-known function on bit strings. 


We describe finite instruction sequences containing only the above-mentioned
instructions that compute this function according to the standard
multiplication algorithm, which is known as the long multiplication algorithm, 
and according to the Karatsuba multiplication algorithm [9, 10]. We calculate 
the exact size of the instruction sequence expressing the long multiplication 
algorithm and lower and upper estimates for the size of the instruction 
sequence expressing the Karatsuba multiplication algorithm. One of the 
results following from the calculated sizes is that the instruction sequence 
expressing the former algorithm is longer than the instruction sequence 
expressing the latter algorithm only if the length of the bit strings involved 
is greater than 2^8.


We also go into the use of an instruction sequence with backward jump 
instructions for expressing the long multiplication algorithm. We describe 
a finite instruction sequence containing a backward jump instruction, in 
addition to the above-mentioned instructions, that expresses a minor variant 
of the long multiplication algorithm. We calculate the exact size of this 
instruction sequence and find that it is shorter than the other two if the 
length of the bit strings involved is greater than 2. In addition, we argue 
that the instruction sequences expressing the long multiplication algorithm
form a hard witness of the inevitable existence of a halting problem in the
practice of imperative programming. 

The Karatsuba multiplication algorithm was devised by Karatsuba 
in 1962 to disprove the conjecture made by Kolmogorov that any algorithm 
to compute the function that models multiplication on natural numbers 
with respect to their representations in the binary number system has time 
complexity Ω(n^2). Shortly afterwards, this divide-and-conquer algorithm
was generalized by Toom and Cook [7, 13]. Later, asymptotically faster 
multiplication algorithms, based on fast Fourier transforms, were devised 
by Schönhage and Strassen [12] and Fürer [8]. To our knowledge, except
for the Schönhage-Strassen algorithm, only informal (natural language or
pseudo code) descriptions of these multiplication algorithms are available. In 
this paper, we provide a mathematically precise alternative to the informal 
descriptions of the Karatsuba multiplication algorithm, using terms from an 
algebraic theory of single-pass instruction sequences defined in [1]. 

It is customary that computing practitioners phrase their explanations
of issues concerning programs from an empirical perspective such as
the perspective that a program is in essence an instruction sequence. An 
attempt to approach the semantics of programming languages from this 
perspective is made in [1]. The groundwork for the approach is an algebraic 
theory of single-pass instruction sequences, called program algebra, and 
an algebraic theory of mathematical objects that represent the behaviours 
produced by instruction sequences under execution, called basic thread
algebra.¹ The line of research referred to at the beginning of this introduction
originates from the above-mentioned work on an approach to programming 
language semantics. 

The general aim of this line of research is to bring instruction sequences 
as a theme in computer science better into the picture. This is the general 
aim of the work presented in the current paper as well. However, different 
from usual in the work referred to above, the accent is this time mainly on a 
practical problem, namely the problem to devise instruction sequences that 
express the long multiplication algorithm and the Karatsuba multiplication 
algorithm. As in the work referred to above, the work presented in the 
current paper is carried out in the setting of program algebra. 

This paper is organized as follows. First, we survey program algebra
and the particular fragment and instantiation of it that is used in this paper
(Section 2) and sketch the Karatsuba multiplication algorithm (Section 3).
Next, we describe how we deal with n-bit words by means of Boolean registers
(Section 4) and how we compute the operations on n-bit words that are used
in the long multiplication algorithm and/or the Karatsuba multiplication
algorithm (Section 5). Then, we describe and analyze instruction sequences
that express these algorithms (Section 6). After this, we go into the use
of an instruction sequence with backward jump instructions for expressing
the long multiplication algorithm (Section 7) and relate the findings to the
halting problem (Section 8). Finally, we make some concluding remarks
(Section 9).

¹ In [1], basic thread algebra is introduced under the name basic polarized process
algebra.

We rely in this paper on an intuitive understanding of what is an 
algorithm and when an instruction sequence expresses an algorithm. A 
rigorous study of these issues and related ones, carried out in the same 
setting as the work presented in this paper, is presented in [5]. 

The preliminaries to the work presented in this paper are a selection 
from the preliminaries to the work presented in [4]. For this reason, there is 
some text overlap with those papers. The preliminaries concern program 
algebra. We only give a brief summary of program algebra. A comprehensive 
introduction, including examples, can be found in [3]. 


2 Program Algebra 


In this section, we present a brief outline of PGA (ProGram Algebra) and 
the particular fragment and instantiation of it that is used in the remainder 
of this paper. A mathematically precise treatment can be found in [4]. 

The starting-point of PGA is the simple and appealing perception of 
a sequential program as a single-pass instruction sequence, i.e. a finite or 
infinite sequence of instructions of which each instruction is executed at 
most once and can be dropped after it has been executed or jumped over. 

It is assumed that a fixed but arbitrary set 𝔄 of basic instructions
has been given. The intuition is that the execution of a basic instruction
may modify a state and produces a reply at its completion. The possible
replies are 0 and 1. The actual reply is generally state-dependent. Therefore,
successive executions of the same basic instruction may produce different
replies. The set 𝔄 is the basis for the set of instructions that may occur in the
instruction sequences considered in PGA. The elements of the latter set are
called primitive instructions. There are five kinds of primitive instructions,
which are listed below:

• for each a ∈ 𝔄, a plain basic instruction a;
• for each a ∈ 𝔄, a positive test instruction +a;
• for each a ∈ 𝔄, a negative test instruction −a;
• for each l ∈ ℕ, a forward jump instruction #l;
• a termination instruction !.


We write ℑ for the set of all primitive instructions.
On execution of an instruction sequence, these primitive instructions 
have the following effects: 


• the effect of a positive test instruction +a is that basic instruction a is
  executed and execution proceeds with the next primitive instruction
  if 1 is produced and otherwise the next primitive instruction is skipped
  and execution proceeds with the primitive instruction following the
  skipped one (if there is no primitive instruction to proceed with,
  inaction occurs);


• the effect of a negative test instruction −a is the same as the effect
  of +a, but with the role of the value produced reversed;


• the effect of a plain basic instruction a is the same as the effect of +a,
  but execution always proceeds as if 1 is produced;


• the effect of a forward jump instruction #l is that execution proceeds
  with the l-th next primitive instruction of the instruction sequence
  concerned (if l equals 0 or there is no primitive instruction to
  proceed with, inaction occurs);


• the effect of the termination instruction ! is that execution terminates.


To build terms, PGA has a constant for each primitive instruction and
two operators. These operators are: the binary concatenation operator ;
and the unary repetition operator ^ω. We use the notation ;_{i=0}^{n} P_i, where
P_0, ..., P_n are PGA terms, for the PGA term P_0 ; ... ; P_n. We also use the
notation P^n. For each PGA term P and n > 0, P^n is the PGA term defined
by induction on n as follows: P^1 = P and P^{n+1} = P ; P^n.

The instruction sequences that concern us in the remainder of this
paper are the finite ones, i.e. the ones that can be denoted by closed PGA
terms in which the repetition operator does not occur. Moreover, the basic
instructions that concern us are instructions to set and get the content of
Boolean registers. More precisely, we take the set


    {in:i.get | i ∈ ℕ⁺} ∪ {out:i.set:b | i ∈ ℕ⁺ ∧ b ∈ {0,1}}
      ∪ {aux:i.get | i ∈ ℕ⁺} ∪ {aux:i.set:b | i ∈ ℕ⁺ ∧ b ∈ {0,1}}


as the set 𝔄 of basic instructions.

Each basic instruction consists of two parts separated by a dot. The 
part on the left-hand side of the dot plays the role of the name of a Boolean 
register and the part on the right-hand side of the dot plays the role of a 
command to be carried out on the named Boolean register. For each i ∈ ℕ⁺:


• in:i serves as the name of the Boolean register that is used as ith input
  register in instruction sequences;


• out:i serves as the name of the Boolean register that is used as ith
  output register in instruction sequences;


• aux:i serves as the name of the Boolean register that is used as ith
  auxiliary register in instruction sequences.


On execution of a basic instruction, the commands have the following effects:


• the effect of get is that nothing changes and the reply is the content of
  the named Boolean register;


• the effect of set:0 is that the content of the named Boolean register
  becomes 0 and the reply is 0;


• the effect of set:1 is that the content of the named Boolean register
  becomes 1 and the reply is 1.


Let n, m ∈ ℕ, let f : {0,1}^n → {0,1}^m, and let X be a finite instruction
sequence that can be denoted by a closed PGA term in the case that 𝔄 is
taken as specified above. Then X computes f if there exists a k ∈ ℕ such
that for all b_1, ..., b_n ∈ {0,1}: if X is executed in an environment with n
input registers, m output registers, and k auxiliary registers, the content of
the input registers with names in:1, ..., in:n are b_1, ..., b_n when execution
starts, and the content of the output registers with names out:1, ..., out:m
are b'_1, ..., b'_m when execution terminates, then f(b_1, ..., b_n) = b'_1 ... b'_m.
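
The execution rules and the notion of computing a function can be made concrete with a small interpreter. The following Python sketch is an illustration only and not part of PGA. It assumes that an instruction sequence is given as a list of strings in the notation used above (with an ASCII "-" for negative tests), that registers are kept in a dictionary, and that registers that were never set contain 0 (an assumption, since the text does not fix the initial content of output and auxiliary registers). The names run and NOT_SEQ are hypothetical.

```python
def run(seq, regs):
    """Execute a finite instruction sequence on a dict of Boolean registers.

    Returns True if execution terminates with "!", False if inaction occurs.
    """
    pc = 0
    while 0 <= pc < len(seq):
        ins = seq[pc]
        if ins == "!":
            return True
        if ins.startswith("#"):            # forward jump #l
            l = int(ins[1:])
            if l == 0:
                return False               # #0 yields inaction
            pc += l
            continue
        sign = ins[0] if ins[0] in "+-" else ""
        name, cmd = ins[len(sign):].split(".")
        if cmd == "get":
            reply = regs.get(name, 0)      # assumption: unset registers contain 0
        else:                              # cmd is "set:0" or "set:1"
            reply = int(cmd[-1])
            regs[name] = reply
        # A positive test skips the next instruction on reply 0,
        # a negative test skips it on reply 1, a plain instruction never skips.
        skip = (sign == "+" and reply == 0) or (sign == "-" and reply == 1)
        pc += 2 if skip else 1
    return False                           # no instruction to proceed with


# Hypothetical example: a sequence computing Boolean negation f(b) = 1 - b.
NOT_SEQ = ["+in:1.get", "#3", "out:1.set:1", "!", "out:1.set:0", "!"]

for b in (0, 1):
    regs = {"in:1": b}
    assert run(NOT_SEQ, regs) and regs["out:1"] == 1 - b
```

In the terminology above, NOT_SEQ computes the function f : {0,1}^1 → {0,1}^1 with f(b) = 1 − b, using no auxiliary registers.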


3 Sketch of Karatsuba Multiplication Algorithm


Suppose that x and y are two natural numbers with a binary representation
of n bits. As a first step toward multiplying x and y, split each of
these representations into a left part of length ⌊n/2⌋ and a right part of
length ⌈n/2⌉. Let us say that the left and right part of the representation
of x represent natural numbers x_L and x_R and the left and right part of
the representation of y represent natural numbers y_L and y_R. It is obvious
that x = 2^{⌈n/2⌉} · x_L + x_R and y = 2^{⌈n/2⌉} · y_L + y_R. From this it follows
immediately that


    x · y = 2^{2·⌈n/2⌉} · x_L · y_L + 2^{⌈n/2⌉} · (x_L · y_R + x_R · y_L) + x_R · y_R.


In addition to this, it is known that


    x_L · y_R + x_R · y_L = (x_L + x_R) · (y_L + y_R) − x_L · y_L − x_R · y_R.


Moreover, it is easy to see that multiplications by powers of 2 are merely 
bit shifts on the binary representation of the natural numbers involved. 
All this means that, on the binary representations of x and y, the
multiplication x · y can be replaced by three multiplications: x_L · y_L, x_R · y_R,
and (x_L + x_R) · (y_L + y_R). These three multiplications concern natural
numbers with binary representations of length ⌊n/2⌋, ⌈n/2⌉, and ⌈n/2⌉ + 1,
respectively. For each of these multiplications it holds that, if the binary
representation length concerned is greater than 3, the multiplication can be
replaced by three multiplications of natural numbers with binary representations
of even shorter length.

The Karatsuba multiplication algorithm is the algorithm that computes 
the binary representation of the product of two natural numbers with binary 
representations of the same length by dividing the computation into the 
computation of the binary representations of three products as indicated 
above and doing so recursively until it no longer leads to further length
reduction. The remaining products are usually computed according to the 
standard multiplication algorithm, which is known as the long multiplication 
algorithm. 
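
At the level of natural numbers, the splitting and recombination described above can be sketched as follows. This Python fragment is only an illustration on integers, not the instruction sequence developed later in the paper. The function name karatsuba is hypothetical, and Python's built-in multiplication is used as a stand-in for long multiplication on operands of length at most 3 (an assumption made to keep the sketch short).

```python
def karatsuba(x, y, n):
    """Multiply natural numbers x, y < 2**n using the three-product split."""
    if n <= 3:
        # No further length reduction is possible: ceil(n/2) + 1 >= n.
        return x * y                      # stand-in for long multiplication
    lo = (n + 1) // 2                     # ceil(n/2): length of the right parts
    hi = n // 2                           # floor(n/2): length of the left parts
    mask = (1 << lo) - 1
    x_l, x_r = x >> lo, x & mask          # x = 2**lo * x_l + x_r
    y_l, y_r = y >> lo, y & mask          # y = 2**lo * y_l + y_r
    p1 = karatsuba(x_l, y_l, hi)                      # x_L * y_L
    p2 = karatsuba(x_r, y_r, lo)                      # x_R * y_R
    p3 = karatsuba(x_l + x_r, y_l + y_r, lo + 1)      # (x_L + x_R) * (y_L + y_R)
    # Multiplications by powers of 2 are bit shifts.
    return (p1 << (2 * lo)) + ((p3 - p1 - p2) << lo) + p2


assert karatsuba(45, 58, 6) == 45 * 58
```

The third product is taken on operands of length ⌈n/2⌉ + 1 because the sums x_L + x_R and y_L + y_R may need one more bit than the right parts.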

Both the Karatsuba multiplication algorithm and the long multiplication 
algorithm can actually be applied to natural numbers represented in the 
binary number system as well as natural numbers represented in the decimal 
number system. The long multiplication algorithm is the multiplication 
algorithm that is taught in schools for computing the product of natural
numbers represented in the decimal number system. It is known that
the long multiplication algorithm has uniform time complexity Θ(n^2) and
the Karatsuba multiplication algorithm has uniform time complexity
Θ(n^{log_2 3}) = Θ(n^{1.5849...}), so the Karatsuba multiplication algorithm is
asymptotically faster than the long multiplication algorithm.


4 Dealing with n-Bit Words 


This section is concerned with dealing with bit strings of length n by means 
of Boolean registers. It contains definitions which facilitate the description 
of instruction sequences that express the long multiplication algorithm and 
the Karatsuba multiplication algorithm. 

Henceforth, it is assumed that a fixed but arbitrary positive natural 
number N has been given. The above-mentioned algorithms compute the binary
representation of the product of two natural numbers represented by bit
strings of the same length. In Section 6, the instruction sequences expressing
these algorithms will be described for the case where this length is N. 

In the sequel, bit strings of length n will mostly be called n-bit words. 
The prefix “n-bit” is left out if n is irrelevant or clear from the context. 

Let κ:i (κ ∈ {in, out, aux}, i ∈ ℕ⁺) be the name of a Boolean register.
Then κ and i are called the kind and number of the Boolean register.
Successive Boolean registers are Boolean registers of the same kind with
successive numbers. Words are stored by means of Boolean registers such
that the successive bits of a stored word are the contents of successive
Boolean registers.

Henceforth, the name of a Boolean register will mostly be used to refer
to the Boolean register in which the least significant bit of a word is stored.
Let κ:i and κ':i' be the names of Boolean registers and let n ∈ ℕ⁺. Then we
say that κ:i and κ':i' lead to partially coinciding n-bit words if κ = κ' and
0 < |i − i'| < n.

The N-bit words representing the two natural numbers for which the 
binary representation of their product is to be computed are stored in advance 
of the computation in input registers, starting with the input register with 
number 1. It is convenient to have available the names I_1 and I_2 for the
input registers in which the least significant bit of these words are stored.
The 2N-bit word representing the product is stored just before the end
of the computation in output registers, starting with the output register
with number 1. It is convenient to have available the name O for the
output register in which the least significant bit of this word is stored. The
words representing intermediate values that arise during the computation are 
temporarily stored in auxiliary registers, starting with the auxiliary register 
with number 1. 


In the case of the Karatsuba algorithm, the binary representation of
the product of two natural numbers with binary representations of the same
length is computed by dividing the computation into the computation of the
binary representations of three products and doing so recursively until it no
longer leads to further length reduction. Therefore, it is convenient to have
available, for sufficiently many natural numbers i, the names I_1^i, I_2^i and O^i
for the auxiliary registers in which the least significant bit of the binary
representations of smaller natural numbers and their product are stored.
Because at each level of recursion, except the last level, the computation
of the binary representation of a product involves the computation of the
binary representations of three products at the next level, it is convenient to
have available, for sufficiently many natural numbers i, the names P_1^i, P_2^i
and P_3^i for the auxiliary registers in which the least significant bit of these
binary representations of products are stored.


It is also convenient to have available the names S_1, S_2, T_1, T_2 for the
auxiliary registers in which the least significant bit of the words that represent
the intermediate values that arise, other than the ones mentioned in the
previous paragraph, are stored. Moreover, it is convenient to have available
the name c for the auxiliary register that contains the carry/borrow bit that
is repeatedly stored when computing the operations that model addition and
subtraction on natural numbers with respect to their binary representation.


Therefore, we define: 


    I_1   ≜ in:1,
    I_2   ≜ in:k   where k = N + 1,
    O     ≜ out:1,
    c     ≜ aux:1,
    S_1   ≜ aux:2,
    S_2   ≜ aux:k  where k = 2·N + 2,
    T_1   ≜ aux:k  where k = 4·N + 2,
    T_2   ≜ aux:k  where k = 6·N + 2,
    I_1^i ≜ aux:k  where k = 10·N·i + 8·N + 2    (0 ≤ i ≤ ⌈log_2(N − 2)⌉),
    I_2^i ≜ aux:k  where k = 10·N·i + 9·N + 2    (0 ≤ i ≤ ⌈log_2(N − 2)⌉),
    O^i   ≜ aux:k  where k = 10·N·i + 10·N + 2   (0 ≤ i ≤ ⌈log_2(N − 2)⌉),
    P_1^i ≜ aux:k  where k = 10·N·i + 12·N + 2   (0 ≤ i ≤ ⌈log_2(N − 2)⌉),
    P_2^i ≜ aux:k  where k = 10·N·i + 14·N + 2   (0 ≤ i ≤ ⌈log_2(N − 2)⌉),
    P_3^i ≜ aux:k  where k = 10·N·i + 16·N + 2   (0 ≤ i ≤ ⌈log_2(N − 2)⌉).


Here i ranges over natural numbers in the interval with lower endpoint 0
and upper endpoint ⌈log_2(N − 2)⌉. This needs some explanation.


Proposition 1 The recursion depth of the Karatsuba multiplication algorithm
applied to bit strings of length N is ⌈log_2(N − 2)⌉.


Proof: Let n ≤ N. In the Karatsuba multiplication algorithm, the
binary representation of the product of two natural numbers with binary
representations of length n is computed by dividing the computation into the
computation of the binary representation of a product of two natural numbers
with binary representations of length ⌊n/2⌋, the binary representation of
a product of two natural numbers with binary representations of length
⌈n/2⌉, and the binary representation of a product of two natural numbers
with binary representations of length ⌈n/2⌉ + 1. The function f defined by
f(n) ≜ ⌈n/2⌉ + 1 has the following properties: (a) f(n) < n iff n > 3; and
(b) for n > 3, the least m such that f^m(n) = 3 is ⌈log_2(n − 2)⌉. This implies
that the recursion depth is ⌈log_2(N − 2)⌉.
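
Property (b) can be checked numerically for small lengths. The following Python sketch iterates f(n) = ⌈n/2⌉ + 1 and compares the number of iterations with ⌈log_2(n − 2)⌉. The function names are hypothetical, and the check covers only the finite range shown, so it illustrates the property rather than proving it.

```python
def recursion_depth(n):
    """Number of applications of f(n) = ceil(n/2) + 1 needed to reach length 3."""
    depth = 0
    while n > 3:
        n = (n + 1) // 2 + 1
        depth += 1
    return depth


def ceil_log2(k):
    """ceil(log2(k)) for an integer k >= 1, computed without floating point."""
    return (k - 1).bit_length()


for n in range(3, 2000):
    assert recursion_depth(n) == ceil_log2(n - 2)
```

The agreement reflects that f(n) − 2 = ⌈(n − 2)/2⌉, so each application of f halves n − 2, rounding up.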


Proposition 1 tells us that the maximum level of recursion that can be
reached is ⌈log_2(N − 2)⌉. So there are ⌈log_2(N − 2)⌉ + 1 possible levels of
recursion, viz. 0, ..., ⌈log_2(N − 2)⌉. This means that there are sufficiently
many natural numbers i for which the names I_1^i, I_2^i, O^i, P_1^i, P_2^i, and P_3^i have
been introduced above. In Section 6, we will use the names I_1^i, I_2^i, O^i, P_1^i, P_2^i,
and P_3^i at the level of recursion ⌈log_2(N − 2)⌉ − i.
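
The register numbers chosen for the level-indexed names can be sanity-checked with a few lines of Python. The sketch below uses the offsets 8·N, 9·N, 10·N, 12·N, 14·N and 16·N from the definitions of this section. The word widths (N bits for the two operands, 2N bits for the product and the three partial products) are an assumption inferred from the spacing of the offsets, and the names OFFSETS, WIDTHS, aux_index and layout are hypothetical.

```python
# Offsets and (assumed) widths, both expressed in multiples of N.
OFFSETS = {"I1": 8, "I2": 9, "O": 10, "P1": 12, "P2": 14, "P3": 16}
WIDTHS = {"I1": 1, "I2": 1, "O": 2, "P1": 2, "P2": 2, "P3": 2}


def aux_index(name, i, N):
    """Number k of the auxiliary register aux:k named name^i."""
    return 10 * N * i + OFFSETS[name] * N + 2


def layout(N, levels):
    """Sorted half-open register ranges (start, end, label) for all levels."""
    spans = []
    for i in range(levels):
        for name in OFFSETS:
            start = aux_index(name, i, N)
            spans.append((start, start + WIDTHS[name] * N, f"{name}^{i}"))
    return sorted(spans)


# Example: N = 8 has recursion depth ceil(log2(6)) = 3, hence 4 levels.
spans = layout(8, 4)
# Under the assumed widths, consecutive words neither overlap nor leave gaps.
assert all(spans[j][1] == spans[j + 1][0] for j in range(len(spans) - 1))
```

Under these assumptions each level occupies a block of 10·N consecutive auxiliary registers, which matches the factor 10·N·i in the definitions.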


5 Computing Operations on n-Bit Words 


This section is concerned with computing operations on bit strings of length n.
It contains definitions which facilitate the description of instruction sequences
that express the long multiplication algorithm and the Karatsuba multiplication
algorithm.

In this section, we will write ββ', where β and β' are bit strings, for
the concatenation of β and β'. In other words, we will use juxtaposition for
concatenation. Moreover, we will use the bit string notation b^n. For n > 0,
the bit string b^n, where b ∈ {0,1}, is defined by induction on n as follows:
b^1 = b and b^{n+1} = b b^n.

The basic operations on words that are relevant to the long multiplication
algorithm and/or the Karatsuba multiplication algorithm are the
operations that model addition, subtraction, and multiplication by 2^m,
modulo 2^n, on natural numbers less than 2^n, with respect to their binary
representation by n-bit words (0 < n ≤ N, 0 < m ≤ n). The operation
modeling multiplication by 2^m is commonly known as "shift left by m
positions". For these operations, we define parameterized instruction sequences
computing them in case the parameters are properly instantiated (see below):


    ADD_n(s_1:k_1, s_2:k_2, d:l) ≜
      c.set:0 ;
      ;_{i=0}^{n−1} (+s_1:k_1+i.get ; #8 ; +s_2:k_2+i.get ; #8 ; −c.get ; #14 ;
        d:l+i.set:1 ; c.set:0 ; #13 ; +s_2:k_2+i.get ; #4 ; +c.get ; #7 ; #7 ;
        +c.get ; #5 ; d:l+i.set:0 ; c.set:1 ; #3 ; +d:l+i.set:0 ; d:l+i.set:1) ,


    SUB_n(s_1:k_1, s_2:k_2, d:l) ≜
      c.set:0 ;
      ;_{i=0}^{n−1} (−s_1:k_1+i.get ; #8 ; +s_2:k_2+i.get ; #8 ; −c.get ; #14 ;
        d:l+i.set:0 ; c.set:0 ; #13 ; +s_2:k_2+i.get ; #4 ; +c.get ; #7 ; #7 ;
        +c.get ; #5 ; d:l+i.set:1 ; c.set:1 ; #3 ; −d:l+i.set:1 ; d:l+i.set:0) ,


    SHL^m_n(s:k, d:l) ≜
      ;_{i=0}^{n−1−m} (+s:k+n−1−m−i.get ; −d:l+n−1−i.set:1 ; d:l+n−1−i.set:0) ;
      ;_{i=0}^{m−1} (d:l+m−1−i.set:0)


where s, s_1, s_2 range over {in, aux}, d ranges over {aux, out}, and k, k_1, k_2, l
range over ℕ⁺. For each of these parameterized instruction sequences, all but
the last parameter correspond to the operands of the operation concerned
and the last parameter corresponds to the result of the operation concerned.
The intended operations are computed provided that the instantiation of
the last parameter and the instantiation of none of the other parameters
lead to partially coinciding n-bit words. In this paper, this condition will
always be satisfied.

In the case of addition and subtraction, the intended operation is
computed according to the long addition algorithm and the long subtraction
algorithm, respectively. There are many instruction sequences expressing
these algorithms. The ones defined above are at present the shortest ones
that we could devise.
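
The addition sequence can be exercised by generating it for a concrete n and executing it. The Python sketch below is an illustration under stated assumptions: instruction sequences are lists of strings with an ASCII "-" for negative tests, registers that were never set contain 0, the carry register c is taken to be aux:1, and words are stored with the least significant bit in the lowest-numbered register. The names run, add_seq and add_words are hypothetical.

```python
def run(seq, regs):
    """Minimal interpreter: True on termination, False on inaction."""
    pc = 0
    while 0 <= pc < len(seq):
        ins = seq[pc]
        if ins == "!":
            return True
        if ins.startswith("#"):
            if ins == "#0":
                return False
            pc += int(ins[1:])
            continue
        sign = ins[0] if ins[0] in "+-" else ""
        name, cmd = ins[len(sign):].split(".")
        reply = regs.get(name, 0) if cmd == "get" else int(cmd[-1])
        if cmd != "get":
            regs[name] = reply
        skip = (sign == "+" and reply == 0) or (sign == "-" and reply == 1)
        pc += 2 if skip else 1
    return False


def add_seq(n, s1, k1, s2, k2, d, l, c="aux:1"):
    """Build ADD_n(s1:k1, s2:k2, d:l) as a list of instruction strings."""
    seq = [f"{c}.set:0"]
    for i in range(n):
        a, b, r = f"{s1}:{k1 + i}", f"{s2}:{k2 + i}", f"{d}:{l + i}"
        seq += [
            f"+{a}.get", "#8", f"+{b}.get", "#8", f"-{c}.get", "#14",
            f"{r}.set:1", f"{c}.set:0", "#13", f"+{b}.get", "#4",
            f"+{c}.get", "#7", "#7",
            f"+{c}.get", "#5", f"{r}.set:0", f"{c}.set:1", "#3",
            f"+{r}.set:0", f"{r}.set:1",
        ]
    return seq


def add_words(x, y, n):
    """Run ADD_n(in:1, in:n+1, out:1) ; ! on the n-bit words for x and y."""
    regs = {}
    for i in range(n):                      # least significant bit first
        regs[f"in:{1 + i}"] = (x >> i) & 1
        regs[f"in:{n + 1 + i}"] = (y >> i) & 1
    assert run(add_seq(n, "in", 1, "in", n + 1, "out", 1) + ["!"], regs)
    return sum(regs[f"out:{1 + i}"] << i for i in range(n))


assert add_words(11, 7, 4) == (11 + 7) % 16
```

Each of the n blocks has 21 instructions, and every jump out of a block lands on the first instruction after it, which is why the blocks can simply be concatenated.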

From now on, if we state that a function on bit strings of length n
models a function on natural numbers less than 2^n, it is implicit that it
does so with respect to the binary representation of these numbers by n-bit 
words. 


Proposition 2 Let n, m ∈ ℕ be such that 0 < n ≤ N and 0 < m ≤ n.
Then the function on bit strings of length n computed by


1. ADD_n(I_1, I_2, O) ; ! models addition modulo 2^n on natural numbers less
than 2^n;


2. SUB_n(I_1, I_2, O) ; ! models subtraction modulo 2^n on natural numbers
less than 2^n;


3. SHL^m_n(I_1, O) ; ! models multiplication by 2^m modulo 2^n on natural
numbers less than 2^n.


Proof: In the case of the first and second property, we prove a stronger 
property that also covers the final content of the auxiliary register containing 
the carry/borrow bit. Each of the stronger properties is easy to prove by 
induction on n with case distinction on the contents of the input registers 
containing the most significant bits of the operands of the operation concerned 
and the content of the auxiliary register containing the carry/borrow bit in 
both the basis step and the inductive step. The third property is easy to 
prove by induction on n with case distinction on the content of the input 
register containing the most significant bit of the operand of the operation 
concerned in both the basis step and the inductive step. 


Transferring n-bit words (0 < n ≤ N) is also relevant to the multiplication
algorithms. For this, we define parameterized instruction sequences as
well. By one the successive bits in a constant n-bit word become the content
of n successive Boolean registers and by the other the successive bits in an
n-bit word that are the content of n successive Boolean registers become the
content of n other successive Boolean registers:

    SET_n(b_0 ... b_{n−1}, d:l) ≜ ;_{i=0}^{n−1} (d:l+i.set:b_i) ,
    MOV_n(s:k, d:l) ≜ ;_{i=0}^{n−1} (+s:k+i.get ; −d:l+i.set:1 ; d:l+i.set:0) ,

where b_0, ..., b_{n−1} range over {0,1}, s ranges over {in, aux}, d ranges over
{aux, out}, and k, l range over ℕ⁺. In the case of MOV_n, the intended
transfer is performed provided that the instantiation of the last parameter and
the instantiation of the first parameter do not lead to partially coinciding
n-bit words. In this paper, this condition will always be satisfied.


Proposition 3 Let n ∈ ℕ be such that 0 < n ≤ N. Then the function on
bit strings of length n computed by


1. SET_n(b_0 ... b_{n−1}, O) ; ! models the natural number constant with binary
representation b_0 ... b_{n−1};


2. MOV_n(I_1, O) ; ! models the identity function on natural numbers less
than 2^n.


Proof: Each of these properties is trivial to prove by induction on n with
case distinction on b_{n−1} and the content of the input register containing the
most significant bit of the operand of the operation, respectively, in both
the basis step and the inductive step.


For convenience's sake, we define some special cases of the parameterized
instruction sequences for transferring n-bit words (0 ≤ m < n):


    ZPAD^m_n(d:l) ≜ SET_{n−m}(0^{n−m}, d:l+m) ,
    MVH^m_n(s:k, d:l) ≜ MOV_m(s:k+(n−m), d:l) ,
    MVL^m_n(s:k, d:l) ≜ MOV_m(s:k, d:l) ,


where s ranges over {in, aux}, d ranges over {aux, out}, and k, l range over ℕ⁺.
ZPAD^m_n is meant for turning a stored m-bit word into a stored n-bit word by
zero padding. MVH^m_n and MVL^m_n are meant for transferring only the m most
significant bits and the m least significant bits, respectively, of a stored n-bit
word.

Because ⌈n/2⌉ + 1 < n iff n > 3, the Karatsuba multiplication algorithm
cannot be used for modeling multiplication on natural numbers less than
2^n with respect to their binary representation by n-bit words if n ≤ 3.
Therefore, we also define a parameterized instruction sequence, in terms of
the above-mentioned basic operations, that computes the operation modeling
multiplication according to the long multiplication algorithm:


    MUL_n(s_1:k_1, s_2:k_2, d:l) ≜
      MOV_n(s_1:k_1, S_1) ; ZPAD^n_{2n}(S_1) ; SET_{2n}(0^{2n}, S_2) ;
      ;_{i=0}^{n−1} (−s_2:k_2+i.get ; #l_i ; ADD_{n+i+1}(S_1, S_2, S_2) ; SHL^1_{n+i+1}(S_1, S_1)) ;
      MOV_{2n}(S_2, d:l) ,

      where l_i = len(ADD_{n+i+1}(S_1, S_2, S_2)) + 1


where s_1, s_2 range over {in, aux}, d ranges over {aux, out}, and k_1, k_2, l range
over ℕ⁺. The additions are done on the fly and the shifts are restricted to
shifts by one position by shifting the result of all preceding shifts.
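
The data flow of this definition can be mirrored on natural numbers. In the following Python sketch, the variables s1 and s2 play the role of the words stored from S_1 and S_2, and each loop iteration corresponds to one test of a bit of the second operand followed by the conditional addition and the shift by one position. It is an integer-level illustration only, not the instruction sequence itself; the function name long_mul is hypothetical.

```python
def long_mul(x, y, n):
    """Multiply x, y < 2**n by shift-and-add, following the shape of MUL_n."""
    s1 = x            # first operand, zero padded to 2n bits
    s2 = 0            # accumulator, initially the 2n-bit word of zeros
    for i in range(n):
        w = n + i + 1                       # word length used in step i
        if (y >> i) & 1:                    # bit i of the second operand is 1
            s2 = (s1 + s2) % (1 << w)       # addition modulo 2**w on w-bit words
        s1 = (s1 << 1) % (1 << w)           # shift left by one position on w-bit words
    return s2                               # finally transferred to the result


assert long_mul(13, 11, 4) == 13 * 11
```

Before step i, s1 equals x·2^i and s2 is smaller than x·2^i, so their sum fits in n + i + 1 bits and the reduction modulo 2^{n+i+1} in the addition never discards a carry.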


Proposition 4 Let n ∈ ℕ be such that 0 < n ≤ N. Then the function on
bit strings of length n computed by MUL_n(I_1, I_2, O) ; ! models multiplication
on natural numbers less than 2^n.


Proof: We prove a stronger property that also covers the final contents
of the 2n successive auxiliary registers starting with the one named S_1 and
the 2n successive auxiliary registers starting with the one named S_2. This
stronger property is easy to prove, using Propositions 2 and 3, by induction
on n with case distinction on the content of the input register containing
the most significant bit of the second operand of the operation concerned in
both the basis step and the inductive step.


The calculation of the lengths of the parameterized instruction sequences 
defined above is a matter of simple additions and multiplications. The lengths 
of these instruction sequences are as follows: 


    len(SHL^m_n(s:k, d:l)) = 3·n − 2·m ,
    len(ADD_n(s_1:k_1, s_2:k_2, d:l)) = 21·n + 1 ,
    len(SUB_n(s_1:k_1, s_2:k_2, d:l)) = 21·n + 1 ,
    len(SET_n(b_0 ... b_{n−1}, d:l)) = n ,
    len(MOV_n(s:k, d:l)) = 3·n ,
    len(ZPAD^m_n(d:l)) = n − m ,
    len(MVH^m_n(s:k, d:l)) = 3·m ,
    len(MVL^m_n(s:k, d:l)) = 3·m ,
    len(MUL_n(s_1:k_1, s_2:k_2, d:l)) = 36·n^2 + 24·n + 1 .


The instruction sequences defined in this section do compute the intended
operations in case of fully coinciding n-bit words.


6 Long Multiplication and Karatsuba Multiplication


In this section, we describe and analyze instruction sequences that express the 
long multiplication algorithm and the Karatsuba multiplication algorithm, 
using the definitions given in Sections 4 and 5. The latter algorithm is 
applicable only if N > 3. 

LMUL_N is the instruction sequence described by


    MUL_N(I_1, I_2, O) ; ! .


We know by Proposition 4 that LMUL_N computes the function on bit strings
that models multiplication on natural numbers less than 2^N. It does so
according to the long multiplication algorithm.


Proposition 5 len(LMUL_N) = 36·N^2 + 24·N + 2.


Proof: This is trivial because len(LMUL_N) = len(MUL_N(I_1, I_2, O)) + 1.


KMUL_N, where N > 3, is the instruction sequence described by


    MOV_N(I_1, I_1^{⌈log_2(N−2)⌉}) ; MOV_N(I_2, I_2^{⌈log_2(N−2)⌉}) ;
    KMA_N ; MOV_{2N}(O^{⌈log_2(N−2)⌉}, O) ; ! ,


where KMA_n is inductively defined in Table 1. KMUL_N computes the
function on bit strings of length N that models multiplication on natural
numbers less than 2^N according to the Karatsuba multiplication algorithm.

In order to compute the binary representation of the product of two 
natural numbers with binary representations of length n by dividing the 
computation into the computations of the binary representations of three 
Table 1: Definition of $\mathit{KMA}_n$ ($1 \le n \le N$)


if $n \le 3$ then:


KMA,, = MUL, (i, 6,0) , 


if n > 3 then: 

KMA, = 
Maver ye’? (£0, 76l™/2))) «pve rll (8), 78/21), 
KMAn/2| ; MOV a1n/2) (OC), PE) ; 
evel! 1°), 120°/2) apy (2), 10/20 , 
KMApy/2] } MOV apn/x (O87, PL) ; 
MVHe!?) (76 11); zPAD?! | (T,); 


[n/2]+1 

uv Ga T2) ; ZPAD| | (To) ; ADD jnjaj4i(N, Ta, ua) 
MVHY) (1, T,); ZPADIM (Ti); 

uv Ge T2) ; ZPAD| il (Te) ; ADD jnjaj4i(N, Ta, pee) 
KMAjp/2]41 3 MOVi i ior, ee 

TPAD en JxCPAD a ees) 


n n L(n 
SUB ytnjayasy(P5! ne ays SUB ytm/o}41y(Ti, Py’ Pee 


SHiZin/7| (pi) 7); sui? | (7, 1); 


ADD 2n( To, Th, T1) ’ ADD on (Ti, PEO, of") ’ 
where $\ell(m) = \lceil \log_2(m - 2) \rceil$.


products as required by the Karatsuba multiplication algorithm, the
instruction sequence $\mathit{KMA}_n$ contains the instruction sequences
$\mathit{KMA}_{\lfloor n/2 \rfloor}$, $\mathit{KMA}_{\lceil n/2 \rceil}$, and
$\mathit{KMA}_{\lceil n/2 \rceil + 1}$. Each of these three instruction
sequences is immediately preceded by an instruction sequence that transfers
the binary representations of the two natural numbers of which it has to
compute the binary representation of their product into the appropriate
Boolean registers for the instruction sequence concerned. Moreover, each of
these three instruction sequences is immediately followed by an instruction
sequence that transfers the binary representation of the product that it has
computed into the appropriate Boolean registers for $\mathit{KMA}_n$. The tail
end of $\mathit{KMA}_n$ completes the computation by performing some
operations on the three binary representations of products computed before
as required by the Karatsuba multiplication algorithm. For the rest,
instruction sequences for zero padding are scattered over $\mathit{KMA}_n$
where necessary to obtain the locally right length of binary representations
of natural numbers.
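
The recursion structure described above can be illustrated on ordinary
integers. The following is a minimal sketch, assuming the standard Karatsuba
identity with a low part of $\lceil n/2 \rceil$ bits and a high part of
$\lfloor n/2 \rfloor$ bits; the function name `karatsuba` and its signature
are hypothetical, and the base case simply uses built-in multiplication where
the instruction sequence would use $\mathit{MUL}_n$.

```python
def karatsuba(x: int, y: int, n: int) -> int:
    """Multiply naturals x, y < 2**n using three recursive sub-products."""
    if n <= 3:
        return x * y                      # base case, stands in for MUL_n

    hi_bits = n // 2                      # floor(n/2): width of the high parts
    lo_bits = n - hi_bits                 # ceil(n/2): width of the low parts
    mask = (1 << lo_bits) - 1

    x_hi, x_lo = x >> lo_bits, x & mask
    y_hi, y_lo = y >> lo_bits, y & mask

    # Three sub-products on words of floor(n/2), ceil(n/2) and ceil(n/2)+1 bits.
    p_hi = karatsuba(x_hi, y_hi, hi_bits)
    p_lo = karatsuba(x_lo, y_lo, lo_bits)
    p_mid = karatsuba(x_hi + x_lo, y_hi + y_lo, lo_bits + 1)

    # Tail end: two subtractions, two shifts and two additions.
    cross = p_mid - p_hi - p_lo
    return (p_hi << (2 * lo_bits)) + (cross << lo_bits) + p_lo


if __name__ == "__main__":
    print(karatsuba(13, 11, 4))           # 143
```

The recursion terminates because $\lceil n/2 \rceil + 1 < n$ whenever $n > 3$,
which is why the direct multiplication is used for $n \le 3$.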


Proposition 6 If $N \ge 3$, then the function on bit strings of length N
computed by $\mathit{KMUL}_N$ models multiplication on natural numbers less
than $2^N$.


Proof: It is straightforward to prove this by induction on N, using the 
equations from Section 3 that form the basis of the Karatsuba multiplication 
algorithm and Propositions 2, 3, and 4. 


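The following proposition gives a lower estimate and an upper estimate
for the length of $\mathit{KMUL}_N$.


Proposition 7 If $N \ge 3$, then:


\[
\begin{aligned}
\mathrm{len}(\mathit{KMUL}_N) &\ge 1184 \cdot 3^{\lfloor \log_2(N) \rfloor - 1} - 716 \cdot 2^{\lfloor \log_2(N) \rfloor - 1} + 12 \cdot N - 70, \\
\mathrm{len}(\mathit{KMUL}_N) &\le 1005 \cdot 3^{\lceil \log_2(N-2) \rceil} - 358 \cdot 2^{\lceil \log_2(N-2) \rceil} + 12 \cdot N - 249.
\end{aligned}
\]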


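Proof: Because $\mathrm{len}(\mathit{KMUL}_N) = \mathrm{len}(\mathit{KMA}_N) + 12 \cdot N + 1$,
we have to prove that


\[
\begin{aligned}
\mathrm{len}(\mathit{KMA}_N) &\ge 1184 \cdot 3^{\lfloor \log_2(N) \rfloor - 1} - 716 \cdot 2^{\lfloor \log_2(N) \rfloor - 1} - 71, \\
\mathrm{len}(\mathit{KMA}_N) &\le 1005 \cdot 3^{\lceil \log_2(N-2) \rceil} - 358 \cdot 2^{\lceil \log_2(N-2) \rceil} - 250.
\end{aligned}
\]


Let $c_1 = \mathrm{len}(\mathit{MUL}_1)$, $c_2 = \mathrm{len}(\mathit{MUL}_2)$,
$c_3 = \mathrm{len}(\mathit{MUL}_3)$, and for each $n > 3$,
$c_n = \mathrm{len}(\mathit{KMA}_n) - \mathrm{len}(\mathit{KMA}_{\lfloor n/2 \rfloor}) - \mathrm{len}(\mathit{KMA}_{\lceil n/2 \rceil}) - \mathrm{len}(\mathit{KMA}_{\lceil n/2 \rceil + 1})$.
Using the already calculated lengths of the parameterized instruction
sequences defined in Section 5, we obtain by simple calculations that
$c_1 = 61$, $c_2 = 193$, $c_3 = 397$, and for each $n > 3$,
$c_n = 126 \cdot \lceil n/2 \rceil + 116 \cdot n + 142$. Let $c'_0 = c_3$,
$c''_0 = c_3$, and for each $m > 0$, $c'_m = c_{2^m + 2}$ and
$c''_m = c_{2^{m+1}}$. In other words, $c'_0 = 397$, $c''_0 = 397$, and for
each $m > 0$, $c'_m = 358 \cdot 2^{m-1} + 500$ and
$c''_m = 358 \cdot 2^m + 142$. Because $\lfloor x \rfloor = k$ iff
$k \le x < k + 1$, $\lceil x \rceil = k$ iff $k - 1 < x \le k$, and
$\log_2(x) = y$ iff $x = 2^y$, it is clear that $c_n \le c'_m$ if
$m = \lceil \log_2(n - 2) \rceil$ and $c_n \ge c''_m$ if
$m = \lfloor \log_2(n) \rfloor - 1$.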

Let $M = \lceil \log_2(N - 2) \rceil$, and let $m \le M$. It follows directly
from the proof of the proposition at the end of Section 4 that, for all n
such that $m = \lceil \log_2(n - 2) \rceil$, the deepest level of recursion at
which $\mathit{KMA}_n$ occurs is $M - m$. Moreover, it follows directly from
the definition of $\mathit{KMA}_n$ that, for all $n > 0$, $\mathit{KMA}_n$
occurs at this level only if n is less than or equal to the greatest $n'$
such that $m = \lceil \log_2(n' - 2) \rceil$. We also have that
$c_n \le c_{n'}$ if $n \le n'$, and $c_{n'} \le c'_m$ if
$m = \lceil \log_2(n' - 2) \rceil$. All this means that
$\mathrm{len}(\mathit{KMA}_N) \le \sum_{i=0}^{M} c'_i \cdot 3^{M-i}$. In other
words,
$\mathrm{len}(\mathit{KMA}_N) \le 397 \cdot 3^M + \sum_{i=1}^{M} ((358 \cdot 2^{i-1} + 500) \cdot 3^{M-i})$.
Using elementary properties of sums and the property that
$\sum_{i=0}^{k} x^i = (1 - x^{k+1})/(1 - x)$, we obtain

\[
\begin{aligned}
397 \cdot 3^M + \sum_{i=1}^{M} ((358 \cdot 2^{i-1} + 500) \cdot 3^{M-i})
 &= 397 \cdot 3^M + 358 \cdot (3^M - 2^M) + 500 \cdot ((3^M - 1)/2) \\
 &= 1005 \cdot 3^M - 358 \cdot 2^M - 250.
\end{aligned}
\]

Hence, because $M = \lceil \log_2(N - 2) \rceil$,
$\mathrm{len}(\mathit{KMA}_N) \le 1005 \cdot 3^{\lceil \log_2(N-2) \rceil} - 358 \cdot 2^{\lceil \log_2(N-2) \rceil} - 250$.

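Let $M' = \lfloor \log_2(N) \rfloor - 1$, and let $m \le M'$. We can show
similarly to above that, for all n such that
$m = \lfloor \log_2(n) \rfloor - 1$, the least deep level of recursion at
which $\mathit{KMA}_n$ occurs is $M' - m$. Moreover, it follows directly from
the definition of $\mathit{KMA}_n$ that, for all $n > 0$, $\mathit{KMA}_n$
occurs at this level only if n is greater than or equal to the least $n'$
such that $m = \lfloor \log_2(n') \rfloor - 1$. We also have that
$c_n \ge c_{n'}$ if $n \ge n'$, and $c_{n'} \ge c''_m$ if
$m = \lfloor \log_2(n') \rfloor - 1$. All this means that
$\mathrm{len}(\mathit{KMA}_N) \ge \sum_{i=0}^{M'} c''_i \cdot 3^{M'-i}$. In
other words,
$\mathrm{len}(\mathit{KMA}_N) \ge 397 \cdot 3^{M'} + \sum_{i=1}^{M'} ((358 \cdot 2^{i} + 142) \cdot 3^{M'-i})$.
Using the same properties of sums as before, we obtain

\[
\begin{aligned}
397 \cdot 3^{M'} + \sum_{i=1}^{M'} ((358 \cdot 2^{i} + 142) \cdot 3^{M'-i})
 &= 397 \cdot 3^{M'} + 358 \cdot (2 \cdot (3^{M'} - 2^{M'})) + 142 \cdot ((3^{M'} - 1)/2) \\
 &= 1184 \cdot 3^{M'} - 716 \cdot 2^{M'} - 71.
\end{aligned}
\]

Hence, because $M' = \lfloor \log_2(N) \rfloor - 1$,
$\mathrm{len}(\mathit{KMA}_N) \ge 1184 \cdot 3^{\lfloor \log_2(N) \rfloor - 1} - 716 \cdot 2^{\lfloor \log_2(N) \rfloor - 1} - 71$.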
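
The recurrence behind the quantities $c_n$ in the proof above can also be
evaluated exactly. The following is a minimal sketch that computes
len(KMA_N) from $c_n$ and the three recursive occurrences, and compares it
with the closed form of the upper estimate derived above. The function names
are hypothetical; the sketch assumes that the recursion bottoms out at
$n \le 3$, where the length is that of the direct multiplication.

```python
from functools import lru_cache


def c(n: int) -> int:
    """Length contributed by KMA_n itself, excluding recursive occurrences."""
    if n <= 3:
        return 36 * n * n + 24 * n + 1             # len(MUL_n): 61, 193, 397
    return 126 * ((n + 1) // 2) + 116 * n + 142    # (n + 1) // 2 is ceil(n/2)


@lru_cache(maxsize=None)
def len_kma(n: int) -> int:
    """Exact length of KMA_n according to the recurrence in the proof."""
    if n <= 3:
        return c(n)
    lo, hi = n // 2, (n + 1) // 2                  # floor(n/2), ceil(n/2)
    return c(n) + len_kma(lo) + len_kma(hi) + len_kma(hi + 1)


def ceil_log2(x: int) -> int:
    """ceil(log2(x)) for an integer x >= 1, computed without floating point."""
    return (x - 1).bit_length()


def upper_estimate(N: int) -> int:
    """Closed form 1005 * 3^M - 358 * 2^M - 250 with M = ceil(log2(N - 2))."""
    M = ceil_log2(N - 2)
    return 1005 * 3 ** M - 358 * 2 ** M - 250


if __name__ == "__main__":
    for N in (3, 4, 8, 64, 1024):
        print(N, len_kma(N), upper_estimate(N))
```

For N = 3 the exact length and the upper estimate coincide (both are 397);
for N = 4 the exact length is 1641.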


It is unclear to us whether it is practically possible to improve the lower
estimate and upper estimate for the length of $\mathit{KMUL}_N$ considerably.
The following is a corollary of Propositions 5 and 7.


Corollary 1 $\mathrm{len}(\mathit{LMUL}_N) = \Theta(N^2)$ and
$\mathrm{len}(\mathit{KMUL}_N) = \Theta(N^{\log_2(3)})$.


This corollary can be paraphrased as follows: the lengths of the instruction
sequences $\mathit{LMUL}_N$ and $\mathit{KMUL}_N$, which express the long
multiplication algorithm and the Karatsuba multiplication algorithm, are
asymptotically bounded, up to a constant factor, both above and below by
$N^2$ and $N^{\log_2(3)}$, respectively. This is striking because these
algorithms are known to compute the function that models multiplication on
natural numbers less than $2^N$ with respect to their binary representation
by N-bit words also in time asymptotically bounded, up to a constant factor,
both above and below by $N^2$ and $N^{\log_2(3)}$, respectively. This
suggests, like some results from [4], that instruction sequence size and
computation time are polynomially related measures.

Using Propositions 5 and 7, it is easy to check that (a) $\mathit{LMUL}_N$ is
longer than $\mathit{KMUL}_N$ only if N > 264 and (b) $\mathit{LMUL}_N$ is
longer than $\mathit{KMUL}_N$ if N > 6666. On that account, the following is
another corollary of Propositions 5 and 7.


Corollary 2 $N > 2^8$ if
$\mathrm{len}(\mathit{LMUL}_N) > \mathrm{len}(\mathit{KMUL}_N)$, and
$\mathrm{len}(\mathit{LMUL}_N) > \mathrm{len}(\mathit{KMUL}_N)$ if
$N > 2^{13}$.


In the area of algorithm efficiency, like in the area of computational com- 
plexity, the focus is mainly on asymptotic properties of algorithms, like 
Corollary 1. To our knowledge, there is virtually no attention in this area to 
properties related to crossover points between algorithms, like Corollary 2. 
We think that properties of the latter kind are frequently more relevant to 
practice than properties of the former kind. However, existing knowledge 
about crossover points between algorithms is mainly based on experimental 
data which are highly dependent on the computer, operating system, pro- 
gramming language and compiler used in the experiment. Moreover, if this 
kind of knowledge is referred to at all, it is often turned into the form of 
a rule of thumb. For example, the following statement and minor variants 
of it can be found at many places (webpages, articles, and books) without 
further justification: “As a rule of thumb, Karatsuba is usually faster when 
the multiplicands are longer than 320-640 bits” (see e.g. [15]). 

It is obvious that $\mathit{LMUL}_N$ and $\mathit{KMUL}_N$ need the same
number of input registers and the same number of output registers. However,
the number of auxiliary registers used by $\mathit{KMUL}_N$ is always greater
than the number of auxiliary registers used by $\mathit{LMUL}_N$. The number
of auxiliary registers used by $\mathit{KMUL}_N$ is
$10 \cdot N \cdot \lceil \log_2(N - 2) \rceil + 18 \cdot N + 1$ and the
number of auxiliary registers used by $\mathit{LMUL}_N$ is only
$4 \cdot N + 1$. In the instance that $N = 2^8$, these numbers correspond to
±3K bytes and ±128 bytes, respectively; and in the instance that
$N = 2^{13}$, these numbers correspond to ±148K bytes and ±4K bytes,
respectively.
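
The byte figures above follow from the two register counts by simple
arithmetic. The following is a minimal sketch, assuming that one Boolean
register corresponds to one bit, that a byte is 8 bits, and that K stands for
1024; the function names are hypothetical.

```python
def ceil_log2(x: int) -> int:
    """ceil(log2(x)) for an integer x >= 1, computed without floating point."""
    return (x - 1).bit_length()


def aux_kmul(N: int) -> int:
    """Auxiliary registers used by KMUL_N (requires N >= 3)."""
    return 10 * N * ceil_log2(N - 2) + 18 * N + 1


def aux_lmul(N: int) -> int:
    """Auxiliary registers used by LMUL_N."""
    return 4 * N + 1


if __name__ == "__main__":
    for N in (2 ** 8, 2 ** 13):
        kmul_bytes = aux_kmul(N) / 8      # one register = one bit (assumption)
        lmul_bytes = aux_lmul(N) / 8
        print(f"N={N}: KMUL {kmul_bytes / 1024:.1f}K bytes, "
              f"LMUL {lmul_bytes:.0f} bytes")
```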

In this paper, we do not answer the question whether there exist instruction
sequences shorter than $\mathit{LMUL}_N$ and $\mathit{KMUL}_N$ that express
the long multiplication algorithm and Karatsuba multiplication algorithm,
respectively. The practical problem with proving or disproving the existence
of shorter instruction sequences is that it basically needs an extremely
extensive case distinction. We expect that, if the length of
$\mathit{LMUL}_N$ and/or $\mathit{KMUL}_N$ can be reduced, it cannot be
reduced much. The reason for this is that we have striven in Section 5 for
instruction sequences without unreachable subsequences, different suffixes
with the same behaviour on execution, and jump instructions that can be
eliminated without introducing different suffixes with the same behaviour on
execution.


7 Long Multiplication and 
Backward Jump Instructions 


In this section, a minor variant of the long multiplication algorithm is ex- 
pressed by an instruction sequence that contains a backward jump instruction 
in addition to instructions to set and get the content of Boolean registers, 
forward jump instructions, and a termination instruction. 


We use the fragment without repetition operator of an extension of
PGA with, for each $l \in \mathbb{N}$, a backward jump instruction \#l as
additional primitive instruction. On execution of an instruction sequence,
the effect of a backward jump instruction \#l is that execution proceeds with
the l-th previous primitive instruction of the instruction sequence
concerned; if l equals 0 or there is no primitive instruction to proceed
with, inaction occurs. We write $\mathrm{PGA}_{\mathrm{bj}}$ for the
above-mentioned extension of PGA. For a mathematically precise treatment of
$\mathrm{PGA}_{\mathrm{bj}}$ without repetition operator, we refer to the
treatment of C, which is a variant of PGA, in [6]. The fragment of
$\mathrm{PGA}_{\mathrm{bj}}$ without the repetition operator coincides with
the fragment of C without backward instructions other than backward jump
instructions.


The additional basic operations on words that are relevant in this section
are the operations that model Euclidean division by $2^m$, decrement by 1,
and nonzero test on natural numbers less than $2^n$, with respect to their
representation by n-bit words ($0 < n \le N$, $0 < m \le n$). The operation
modeling Euclidean division by $2^m$ is commonly known as “shift right by
m positions”. For these operations, we define parameterized instruction
sequences computing them in case the parameters are properly instantiated
(see below):
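\[
\begin{aligned}
\mathit{SHR}_n^m(s{:}k, d{:}l) ={} &
  ;_{i=0}^{n-m-1} ( +s{:}k{+}m{+}i.\mathrm{get} \mathbin{;} -d{:}l{+}i.\mathrm{set}{:}1 \mathbin{;} d{:}l{+}i.\mathrm{set}{:}0 ) \mathbin{;} \\
 & ;_{i=0}^{m-1} ( d{:}l{+}n{-}m{+}i.\mathrm{set}{:}0 ) \, , \\
\mathit{DEC}_n(s{:}k, d{:}l) ={} &
  ;_{i=0}^{n-1} ( -s{:}k{+}i.\mathrm{get} \mathbin{;} \#3 \mathbin{;} d{:}l{+}i.\mathrm{set}{:}0 \mathbin{;} \#5 \mathbin{;} d{:}l{+}i.\mathrm{set}{:}1 ) \mathbin{;} \#1 \mathbin{;} \#1 \mathbin{;} \#1 \, , \\
\mathit{ISNZ}_n(s{:}k) ={} &
  ;_{i=0}^{n-1} ( +s{:}k{+}i.\mathrm{get} \mathbin{;} \#2 ) \mathbin{;} \#2 \, ,
\end{aligned}
\]


where s ranges over {in, aux}, d ranges over {aux, out}, and k, l range over
$\mathbb{N}^+$.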
For each of the first two parameterized instruction sequences, the first
parameter corresponds to the operand of the operation concerned and the
second parameter corresponds to the result of the operation concerned. The
intended operations are computed provided that the instantiation of the first
parameter and the instantiation of the second parameter do not lead to
partially coinciding n-bit words. In this section, this condition will always
be satisfied. No result is stored on execution of $\mathit{ISNZ}_n$. Instead,
the first primitive instruction following $\mathit{ISNZ}_n$ is skipped if the
nonzero test fails.


Proposition 8 Let $n, m \in \mathbb{N}$ be such that $0 < n \le N$ and
$0 < m \le n$. Then the function on bit strings of length n computed by


1. $\mathit{SHR}_n^m(I_1, O) \mathbin{;} {!}$ models Euclidean division by
$2^m$ modulo $2^n$ on natural numbers less than $2^n$;


2. $\mathit{DEC}_n(I_1, O) \mathbin{;} {!}$ models subtraction by 1 modulo
$2^n$ on natural numbers less than $2^n$;


3. $\mathit{ISNZ}_n(I_1) \mathbin{;} +O.\mathrm{set}{:}1 \mathbin{;} O.\mathrm{set}{:}0 \mathbin{;} {!}$
models the function isnz from natural numbers less than $2^n$ to natural
numbers less than $2^1$ defined by isnz(0) = 0 and isnz(k + 1) = 1 with
respect to their binary representation by n-bit words and 1-bit words,
respectively.


Proof: Each of these properties is easy to prove by induction on n with
case distinction on the content of the input register containing the most
significant bit of the operand of the operation concerned in both the basis
step and the inductive step.


The lengths of the parameterized instruction sequences defined above 
are as follows: 
\[
\begin{aligned}
\mathrm{len}(\mathit{SHR}_n^m(s{:}k, d{:}l)) &= 3 \cdot n - 2 \cdot m, \\
\mathrm{len}(\mathit{DEC}_n(s{:}k, d{:}l)) &= 5 \cdot n + 3, \\
\mathrm{len}(\mathit{ISNZ}_n(s{:}k)) &= 2 \cdot n + 1.
\end{aligned}
\]
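
To make the behaviour of these three parameterized instruction sequences
tangible, the following is a minimal sketch of a toy interpreter together
with generators for the shift right, decrement, and nonzero test sequences.
Several points are assumptions of the sketch rather than statements of the
paper: instructions are encoded as plain strings, registers are named
`kind:index` with the lowest index holding the least significant bit,
unset registers hold 0, a positive test skips the next instruction on reply 0,
a negative test skips it on reply 1, and `set:b` replies b. The decrement is
exercised in place (operand and result coincide fully). All function names
are hypothetical.

```python
def run(seq, regs, max_steps=100_000):
    """Run an instruction sequence; return True if it terminates with '!'."""
    pc = 0
    for _ in range(max_steps):
        if not 0 <= pc < len(seq):
            return False                      # nothing to proceed with: inaction
        ins = seq[pc]
        if ins == "!":
            return True                       # termination instruction
        if ins.startswith("\\#") or ins.startswith("#"):
            back = ins.startswith("\\#")      # "\#l" is a backward jump
            l = int(ins.lstrip("\\#"))
            if l == 0:
                return False                  # jump of length 0: inaction
            pc += -l if back else l
            continue
        test = ins[0] if ins[0] in "+-" else ""
        reg, op = ins[len(test):].rsplit(".", 1)
        if op == "get":
            reply = regs.get(reg, 0)          # unset registers hold 0 (assumption)
        else:
            reply = regs[reg] = int(op[-1])   # set:b stores b, replies b (assumption)
        skip = (test == "+" and not reply) or (test == "-" and reply)
        pc += 2 if skip else 1
    return False                              # step budget exhausted


def shr(n, m, s, k, d, l):
    seq = []
    for i in range(n - m):
        seq += [f"+{s}:{k+m+i}.get", f"-{d}:{l+i}.set:1", f"{d}:{l+i}.set:0"]
    return seq + [f"{d}:{l+n-m+i}.set:0" for i in range(m)]


def dec(n, s, k, d, l):
    seq = []
    for i in range(n):
        seq += [f"-{s}:{k+i}.get", "#3", f"{d}:{l+i}.set:0", "#5",
                f"{d}:{l+i}.set:1"]
    return seq + ["#1", "#1", "#1"]


def isnz(n, s, k):
    seq = []
    for i in range(n):
        seq += [f"+{s}:{k+i}.get", "#2"]
    return seq + ["#2"]


def store(regs, kind, k, n, value):
    for i in range(n):                        # register k holds the least significant bit
        regs[f"{kind}:{k+i}"] = (value >> i) & 1


def load(regs, kind, k, n):
    return sum(regs.get(f"{kind}:{k+i}", 0) << i for i in range(n))


if __name__ == "__main__":
    regs = {}
    store(regs, "in", 1, 4, 0b1011)
    run(shr(4, 1, "in", 1, "out", 1) + ["!"], regs)
    print(load(regs, "out", 1, 4))            # 5

    store(regs, "aux", 1, 3, 4)
    run(dec(3, "aux", 1, "aux", 1) + ["!"], regs)
    print(load(regs, "aux", 1, 3))            # 3
```

The uniform jumps `#5` and `#2` chain from one block to the next, and the
trailing jump instructions make the last jump of the chain land exactly on
the first instruction after the sequence (or, for the nonzero test, one
instruction further when the test fails).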


For each bit of the representation of the multiplier, $\mathit{LMUL}_N$ contains a
different instruction sequence. This seems to exclude the use of backward 
jump instructions to obtain an instruction sequence of significantly shorter 
length, unless provision is made for some form of indirect addressing for 
Boolean registers. However, there exists a minor variant of the long mul- 
tiplication algorithm that makes it possible to have the same instruction 
sequence for each bit of the representation of the multiplier. From the least 
significant bit of the representation of the multiplier onwards, the algorithm 
concerned shifts the representation of the multiplier by one position to the 
right after it has dealt with a bit. In this way, the next bit remains the least 
significant one throughout. 

We proceed with describing an instruction sequence without backward 
jump instructions that expresses this minor variant of the long multiplication 
algorithm. 

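$\mathit{LMUL}'_N$ is the instruction sequence described by


\[
\begin{array}{l}
\mathit{MOV}_N(I_1, S_1) \mathbin{;} \mathit{ZPAD}_{2N}^{N}(S_1) \mathbin{;} \mathit{SET}_{2N}(0^{2N}, S_2) \mathbin{;} \mathit{MOV}_N(I_2, T_1) \mathbin{;} \\
(-T_1.\mathrm{get} \mathbin{;} \#l \mathbin{;} \mathit{ADD}_{2N}(S_1, S_2, S_2) \mathbin{;} \mathit{SHL}_{2N}^{1}(S_1, S_1) \mathbin{;} \mathit{SHR}_{N}^{1}(T_1, T_1))^{N} \mathbin{;} \\
\mathit{MOV}_{2N}(S_2, O) \mathbin{;} {!} \, ,
\end{array}
\]


where


\[ l = \mathrm{len}(\mathit{ADD}_{2N}(S_1, S_2, S_2)) + 1 = 42 \cdot N + 2 . \]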
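
The effect of this unrolled instruction sequence can be mirrored on ordinary
integers. The following is a minimal sketch in which the variables `s1`,
`s2`, and `t1` play the roles of the 2N-bit words starting at $S_1$ and $S_2$
and the N-bit word starting at $T_1$; the function name is hypothetical, and
arithmetic modulo $2^{2N}$ stands in for the fixed word length.

```python
def lmul_variant(x: int, y: int, N: int) -> int:
    """Long multiplication variant: always inspect the least significant bit."""
    mask = (1 << (2 * N)) - 1
    s1 = x            # multiplicand, zero-padded to 2N bits
    s2 = 0            # running sum, 2N bits
    t1 = y            # multiplier, N bits
    for _ in range(N):                  # N copies of the same block
        if t1 & 1:                      # test the least significant bit of the multiplier
            s2 = (s2 + s1) & mask       # add the shifted multiplicand to the sum
        s1 = (s1 << 1) & mask           # shift the multiplicand left by one position
        t1 >>= 1                        # shift the multiplier right by one position
    return s2


if __name__ == "__main__":
    print(lmul_variant(13, 11, 4))      # 143
```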


Proposition 9 The function on bit strings of length N computed by
$\mathit{LMUL}'_N$ models multiplication on natural numbers less than $2^N$.


Proof: We prove a stronger property that also covers the final contents
of the 2N successive auxiliary registers starting with the one named $S_1$,
the 2N successive auxiliary registers starting with the one named $S_2$, and
the N successive auxiliary registers starting with the one named $T_1$. This
stronger property is straightforward to prove, using Propositions 2, 3, and 8,
by induction on N with case distinction on the content of the input register
containing the most significant bit of the second operand of the operation
concerned in both the basis step and the inductive step.


Proposition 10 $\mathrm{len}(\mathit{LMUL}'_N) = 51 \cdot N^2 + 14 \cdot N + 1$.


Proof: This is a matter of simple additions, subtractions, and multiplica- 
tions. 


The following is a corollary of Propositions 5 and 10.


Corollary 3 $\mathrm{len}(\mathit{LMUL}'_N) > \mathrm{len}(\mathit{LMUL}_N)$.


For each bit of the representation of the multiplier, $\mathit{LMUL}'_N$ contains
the same instruction sequence. That is, it contains N duplicates of the same 
instruction sequence. This duplication can be eliminated by implementing a 
loop by means of a backward jump instruction. 

We proceed with describing an instruction sequence with a backward 
jump instruction that expresses the minor variant of the long multiplication
algorithm. We write $\overline{N}$ for the shortest representation of the
natural number N in the binary number system.

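$\mathit{LMUL}''_N$ is the instruction sequence described by


\[
\begin{array}{l}
\mathit{MOV}_N(I_1, S_1) \mathbin{;} \mathit{ZPAD}_{2N}^{N}(S_1) \mathbin{;} \mathit{SET}_{2N}(0^{2N}, S_2) \mathbin{;} \mathit{MOV}_N(I_2, T_1) \mathbin{;} \\
\mathit{SET}_{\lfloor \log_2(N) \rfloor + 1}(\overline{N}, T_2) \mathbin{;} \\
-T_1.\mathrm{get} \mathbin{;} \#l_1 \mathbin{;} \mathit{ADD}_{2N}(S_1, S_2, S_2) \mathbin{;} \mathit{SHL}_{2N}^{1}(S_1, S_1) \mathbin{;} \mathit{SHR}_{N}^{1}(T_1, T_1) \mathbin{;} \\
\mathit{DEC}_{\lfloor \log_2(N) \rfloor + 1}(T_2, T_2) \mathbin{;} \mathit{ISNZ}_{\lfloor \log_2(N) \rfloor + 1}(T_2) \mathbin{;} \backslash\#l_2 \mathbin{;} \\
\mathit{MOV}_{2N}(S_2, O) \mathbin{;} {!} \, ,
\end{array}
\]


where


\[
\begin{aligned}
l_1 &= \mathrm{len}(\mathit{ADD}_{2N}(S_1, S_2, S_2)) + 1 = 42 \cdot N + 2, \\
l_2 &= \mathrm{len}(-T_1.\mathrm{get} \mathbin{;} \ldots \mathbin{;} \mathit{ISNZ}_{\lfloor \log_2(N) \rfloor + 1}(T_2)) = 51 \cdot N + 7 \cdot \lfloor \log_2(N) \rfloor + 10.
\end{aligned}
\]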
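
The control structure of this instruction sequence is that of a loop whose
body is executed at least once and that is repeated as long as a counter is
nonzero. The following is a minimal sketch on ordinary integers; `t2` plays
the role of the counter word starting at $T_2$, the function name is
hypothetical, and the `break` models the case in which the nonzero test fails
and the backward jump instruction is skipped.

```python
def lmul_loop(x: int, y: int, N: int) -> int:
    """Long multiplication variant with a counted loop instead of N copies."""
    mask = (1 << (2 * N)) - 1
    width = N.bit_length()              # floor(log2(N)) + 1 bits for the counter
    s1, s2, t1 = x, 0, y
    t2 = N                              # counter initialised to N
    while True:
        if t1 & 1:                      # least significant bit of the multiplier
            s2 = (s2 + s1) & mask
        s1 = (s1 << 1) & mask
        t1 >>= 1
        t2 = (t2 - 1) % (1 << width)    # decrement by 1 modulo 2**width
        if t2 == 0:                     # nonzero test fails: skip the backward jump
            break
    return s2


if __name__ == "__main__":
    print(lmul_loop(13, 11, 4))         # 143
```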


Proposition 11 The function on bit strings of length N computed by
$\mathit{LMUL}''_N$ models multiplication on natural numbers less than $2^N$.


Proof: We prove a stronger property that also covers the final contents of
the 2N successive auxiliary registers starting with the one named $S_1$, the
2N successive auxiliary registers starting with the one named $S_2$, the N
successive auxiliary registers starting with the one named $T_1$, and the
$\lfloor \log_2(N) \rfloor + 1$ successive auxiliary registers starting with
the one named $T_2$. This stronger property is straightforward to prove,
using Propositions 2, 3, and 8, by induction on N with case distinction on
the content of the input register containing the most significant bit of the
second operand of the operation concerned in both the basis step and the
inductive step.


Proposition 12 $\mathrm{len}(\mathit{LMUL}''_N) = 66 \cdot N + 8 \cdot \lfloor \log_2(N) \rfloor + 13$.


Proof: This is a matter of simple additions, subtractions, and multiplica- 
tions. 
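
The additions and multiplications behind Propositions 10 and 12 can be
spelled out mechanically. The following is a minimal sketch that sums the
lengths of the constituent instruction sequences, using the length formulas
given earlier, and compares the sums with the closed forms; all function
names are hypothetical.

```python
def floor_log2(n: int) -> int:
    return n.bit_length() - 1           # floor(log2(n)) for n >= 1

# Lengths of the parameterized instruction sequences used as building blocks.
def len_mov(n): return 3 * n
def len_zpad(n, m): return n - m
def len_set(n): return n
def len_add(n): return 21 * n + 1
def len_shl(n, m): return 3 * n - 2 * m
def len_shr(n, m): return 3 * n - 2 * m
def len_dec(n): return 5 * n + 3
def len_isnz(n): return 2 * n + 1


def len_block(N):
    """Test, forward jump, addition and the two shifts handling one multiplier bit."""
    return 1 + 1 + len_add(2 * N) + len_shl(2 * N, 1) + len_shr(N, 1)


def len_unrolled(N):
    """Length of the variant without backward jump (N copies of the block)."""
    prologue = len_mov(N) + len_zpad(2 * N, N) + len_set(2 * N) + len_mov(N)
    return prologue + N * len_block(N) + len_mov(2 * N) + 1


def len_looped(N):
    """Length of the variant with one backward jump instruction."""
    w = floor_log2(N) + 1               # width of the loop counter
    prologue = (len_mov(N) + len_zpad(2 * N, N) + len_set(2 * N)
                + len_mov(N) + len_set(w))
    loop = len_block(N) + len_dec(w) + len_isnz(w) + 1   # + backward jump
    return prologue + loop + len_mov(2 * N) + 1


if __name__ == "__main__":
    for N in (1, 2, 8, 64):
        print(N, len_unrolled(N), len_looped(N))
```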


The following is a corollary of Propositions 5, 10, and 12. 


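Corollary 4 $\mathrm{len}(\mathit{LMUL}''_N) = \Theta(N)$ while both
$\mathrm{len}(\mathit{LMUL}_N) = \Theta(N^2)$ and
$\mathrm{len}(\mathit{LMUL}'_N) = \Theta(N^2)$.


Hence, $\mathit{LMUL}''_N$ is asymptotically shorter than both
$\mathit{LMUL}_N$ and $\mathit{LMUL}'_N$. By Corollary 1, we know that
$\mathit{LMUL}''_N$ is asymptotically shorter than $\mathit{KMUL}_N$ too.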

The following is a corollary of Propositions 5, 7, 10, and 12. 


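Corollary 5 Both
$\mathrm{len}(\mathit{LMUL}''_N) < \mathrm{len}(\mathit{LMUL}_N)$ and
$\mathrm{len}(\mathit{LMUL}''_N) < \mathrm{len}(\mathit{LMUL}'_N)$ if N > 1,
and what is more,
$\mathrm{len}(\mathit{LMUL}''_N) < \mathrm{len}(\mathit{KMUL}_N)$ if N > 2.


Hence, $\mathit{LMUL}''_N$ is already shorter than $\mathit{LMUL}_N$,
$\mathit{LMUL}'_N$, and $\mathit{KMUL}_N$ if N is still very small. In fact,
long multiplication is non-trivial only if N > 1 and Karatsuba
multiplication is applicable only if N > 2.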
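
The comparison for small N can be checked numerically. The following is a
minimal sketch that evaluates the closed forms of Propositions 5, 10, and 12
and, for the Karatsuba instruction sequence, the exact length obtained from
the recurrence used in the proof of Proposition 7 plus the $12 \cdot N + 1$
instructions surrounding it. The function names are hypothetical.

```python
from functools import lru_cache


def len_long(N):                 # long multiplication, forward jumps only
    return 36 * N * N + 24 * N + 2


def len_unrolled(N):             # minor variant, forward jumps only
    return 51 * N * N + 14 * N + 1


def len_looped(N):               # minor variant with a backward jump
    return 66 * N + 8 * (N.bit_length() - 1) + 13


@lru_cache(maxsize=None)
def len_kma(n):
    """Exact length of the recursive Karatsuba part for n-bit operands."""
    if n <= 3:
        return 36 * n * n + 24 * n + 1
    lo, hi = n // 2, (n + 1) // 2
    own = 126 * hi + 116 * n + 142
    return own + len_kma(lo) + len_kma(hi) + len_kma(hi + 1)


def len_karatsuba(N):            # defined for N >= 3
    return len_kma(N) + 12 * N + 1


if __name__ == "__main__":
    for N in range(1, 9):
        row = [len_looped(N), len_long(N), len_unrolled(N)]
        if N >= 3:
            row.append(len_karatsuba(N))
        print(N, row)
```

For N = 2 this gives 153, 194, and 233 for the three long multiplication
variants, and for N = 3 the looped variant has length 219 against 434 for
the Karatsuba instruction sequence.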


8 Long Multiplication and the Halting Problem 


In this section, we argue that the instruction sequences $\mathit{LMUL}'_N$
and $\mathit{LMUL}''_N$ from Section 7 form a hard witness of the inevitable
existence of a halting problem in the practice of imperative programming.

Turing’s result regarding the undecidability of the halting problem (see 
e.g. [14]) is a result about Turing machines. In [2], we consider it as a 
result about programs rather than machines, taking instruction sequences 
as programs. The instruction sequences concerned are essentially the finite 
instruction sequences that can be denoted by closed $\mathrm{PGA}_{\mathrm{bj}}$ terms. Unlike
in the current paper, the basic instructions are not fixed, but their effects 
are restricted to the manipulation of something that can be understood as 
the content of the tape of a Turing machine with a specific tape alphabet, 
together with the position of the tape head. Different choices of basic
instructions give rise to different halting problem instances and one of these 
instances is essentially the same as the halting problem for Turing machines. 
Because of their orientation to Turing machines, we consider all instances 
treated in [2] theoretical halting problem instances. 


All halting problem instances would evaporate if the instruction sequences
concerned were restricted to the ones without backward jump instructions.
This is irrespective of whether the effects of the basic instructions have
anything to do with the manipulation of a Turing machine tape. In the case
that we have basic instructions to set and get the content of Boolean
registers, instruction sequences without backward jump instructions are
sufficient to compute all functions $f : \{0,1\}^n \to \{0,1\}^m$
($n, m \in \mathbb{N}$). This raises the question whether there exists a good
reason for not abandoning backward jump instructions altogether in such
cases. The function that models multiplication on natural numbers less than
$2^N$ with respect to their binary representation by N-bit words offers a
good reason: the length of the instruction sequence that computes it
according to the long multiplication algorithm can be reduced significantly
by the use of backward jump instructions. The length of the instruction
sequence that computes this function can be reduced even more by the use of
backward jump instructions than by going over to one of the multiplication
algorithms that are known to yield shorter instruction sequences without
backward jump instructions than the long multiplication algorithm, such as
the Karatsuba multiplication algorithm.


Thus, the instruction sequences $\mathit{LMUL}'_N$ and $\mathit{LMUL}''_N$
form a hard witness of the inevitable existence of a halting problem in the
practice of imperative programming, where programs must have manageable size.
Because of its orientation to actual programming, we consider the halting 
problem for the instruction sequences with forward and backward jump 
instructions, and with only basic instructions to set and get the content of 
Boolean registers, a practical halting problem. It is unknown to us whether 
there is a connection between the solvability or unsolvability of the halting 
problem for these instruction sequences and some form of diagonal argument. 
It is easy to prove that this halting problem is both NP-hard and coNP-hard. 
We do not know whether stronger lower bounds for its complexity can be 
found in the literature. An extensive search for such lower bounds and other 
results concerning this halting problem or a similar halting problem has been 
unsuccessful. 

9 Concluding Remarks


We have described finite instruction sequences, containing only instructions
to set and get the content of Boolean registers, forward jump instructions,
and a termination instruction, that compute the function that models
multiplication on natural numbers less than $2^N$ with respect to their
binary representation by N-bit words according to the long multiplication
algorithm and the Karatsuba multiplication algorithm. We have described those
instruction sequences by means of terms of PGA, an algebraic theory of
single-pass instruction sequences.


Thus, we have provided mathematically precise alternatives to the
natural language and pseudo code descriptions of these multiplication
algorithms found in mathematics and computer science literature on
multiplication algorithms. Moreover, we have calculated the exact size of the
instruction sequence $\mathit{LMUL}_N$ expressing the long multiplication
algorithm and lower and upper estimates for the size of the instruction
sequence $\mathit{KMUL}_N$ expressing the Karatsuba multiplication algorithm.
The results following from the calculated sizes include:
(a) $\mathrm{len}(\mathit{LMUL}_N) = \Theta(N^2)$ and
$\mathrm{len}(\mathit{KMUL}_N) = \Theta(N^{\log_2(3)})$;
(b) $N > 2^8$ if
$\mathrm{len}(\mathit{LMUL}_N) > \mathrm{len}(\mathit{KMUL}_N)$, and
$\mathrm{len}(\mathit{LMUL}_N) > \mathrm{len}(\mathit{KMUL}_N)$ if
$N > 2^{13}$. It is suggested by (a) that instruction sequence size and
computation time are polynomially related measures. It is still an open
question whether this is the case.


As a bonus, we have found that the number of auxiliary registers used
by $\mathit{LMUL}_N$ is $4 \cdot N + 1$ and the number of auxiliary registers
used by $\mathit{KMUL}_N$ is
$10 \cdot N \cdot \lceil \log_2(N - 2) \rceil + 18 \cdot N + 1$. It is also
an open question whether the number of auxiliary registers that are used by
an instruction sequence and computation space are related measures.


We have also gone into the use of an instruction sequence with backward
jump instructions for expressing the long multiplication algorithm. We have
described a finite instruction sequence $\mathit{LMUL}''_N$ containing a
backward jump instruction, in addition to the instructions to set and get the
content of Boolean registers, forward jump instructions, and a termination
instruction, that expresses a minor variant of the long multiplication
algorithm. We have calculated the exact size of this instruction sequence and
have found that: (a) $\mathrm{len}(\mathit{LMUL}''_N) = \Theta(N)$;
(b) $\mathrm{len}(\mathit{LMUL}''_N) < \mathrm{len}(\mathit{LMUL}_N)$ if
N > 1, and
(c) $\mathrm{len}(\mathit{LMUL}''_N) < \mathrm{len}(\mathit{KMUL}_N)$ if
N > 2. Furthermore, we have related these findings to the halting problem.